Search results for "clustering initialization"

Showing 2 of 2 documents

Improving Scalable K-Means++

2021

Two new initialization methods for K-means clustering are proposed. Both apply a divide-and-conquer approach to a K-means‖-type initialization strategy. The second proposal additionally uses multiple lower-dimensional subspaces produced by random projection for the initialization. The proposed methods are scalable and can be run in parallel, which makes them suitable for initializing large-scale problems. In the experiments, the proposed methods are compared with K-means++ and K-means‖ on an extensive set of reference and synthetic large-scale datasets. Concerning the latter, a novel high-dimensional clustering data generation …

random projection; lcsh:T55.4-60.8; K-means++; algoritmit (algorithms); clustering initialization; algoritmiikka (algorithmics); lcsh:Industrial engineering. Management engineering; klusterianalyysi (cluster analysis); lcsh:Electronic computers. Computer science; tiedonlouhinta (data mining); K-means‖; lcsh:QA75.5-76.95
researchProduct
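
For orientation, the kind of seeding this abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's method: `kmeans_pp_seed` implements standard K-means++ D² sampling, and `divide_and_conquer_seed` is a hypothetical helper that only shows the general divide-and-conquer idea (seed each partition, then seed again over the pooled candidates). The function names, the partitioning scheme, and the default random seed are all assumptions made for the example; the random-projection variant is not shown.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seed(points, k, rng=None):
    """Pick k initial centers with standard K-means++ (D^2) sampling."""
    if rng is None:
        rng = random.Random(0)  # fixed seed: an assumption, for repeatability
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center so far.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            # Sample the next center with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centers.append(points[idx])
        d2 = [min(old, squared_distance(p, centers[-1]))
              for p, old in zip(points, d2)]
    return centers


def divide_and_conquer_seed(points, k, n_parts, rng=None):
    """Hypothetical sketch: seed each partition, then seed over the candidates."""
    if rng is None:
        rng = random.Random(0)
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Strided slices give n_parts roughly equal random partitions.
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    parts = [part for part in parts if part]  # drop empty partitions
    candidates = []
    for part in parts:
        # Each partition could be processed in parallel; done serially here.
        candidates.extend(kmeans_pp_seed(part, min(k, len(part)), rng))
    return kmeans_pp_seed(candidates, k, rng)
```

Each partition is seeded independently, which is what makes this style of initialization easy to parallelize; the final pass runs only over the much smaller candidate set.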

Improvements and applications of the elements of prototype-based clustering

2018

Clustering, or cluster analysis, is an essential part of data mining, machine learning, and pattern recognition. The most widely applied clustering methods are partitioning-based or prototype-based methods. Prototype-based clustering methods are usually easy to implement and scale well. These methods, such as K-means clustering, have been used in applications across various fields. On the other hand, prototype-based clustering methods are typically sensitive to initialization, and selecting the number of clusters for knowledge discovery purposes is not straightforward. In the era of big data, in high-velocity, ever-growing datasets, which can also be erroneous, outl…

random projection; parallel computing; knowledge discovery; clustering initialization; minimal learning machine; data mining; prototype-based clustering; machine learning; koneoppiminen (machine learning); big data; rinnakkaiskäsittely (parallel processing); klusterianalyysi (cluster analysis); tiedonlouhinta (data mining); robust clustering; K-means
researchProduct